INTRODUCTION:


The primary focus in identifying Alzheimer disease (AD) risk genes over the past five years has been
testing the common disease common variant (CDCV) hypothesis through genome-wide association
studies (GWAS) in late-onset Alzheimer disease (LOAD). While common variation clearly plays a role in AD,
there is a growing realization that the CDCV hypothesis is unlikely to explain all of the genetic effect underlying
AD. One alternative hypothesis invokes multiple rare variants (RV) in one or more genes, each with stronger
individual effects than CDCV genes. We designed this project to test the rare variant hypothesis in AD by
examining those cases with the most severe phenotype as determined by early onset (EOAD, cases with age
at onset (AAO) < 60 years). Although there are three known EOAD genes (PS1, PS2 and APP), they account
for only ~60-70% of familial EOAD and even less of sporadic EOAD. Thus, the majority of the genetics of EOAD
remains unknown. Until now, large extended families with AD in multiple generations were necessary to identify
variants of significant effect contributing to AD risk. However, with the advent of new genomic technologies
such as high-throughput sequencing, small family aggregates and isolated cases, particularly those
with an extreme phenotype of the disorder (such as early onset), can be used. Thus, we will use whole
exome high-throughput sequencing to identify high-risk AD variants that we will further characterize with
respect to AD. We will examine both Caucasian and Caribbean Hispanic AD populations. Our two-pronged
approach includes structural characterization at the DNA level (Dr. Pericak-Vance) and analysis of Caribbean
Hispanics (Dr. Richard Mayeux). Comparing across populations will be extremely useful. Specifically, high
priority RVs identified through the whole exome analysis will be further explored with multiple strategies. We
will also genotype the interesting variants in a large sample of late-onset (LOAD) cases to examine their
involvement in all AD. We will thus prepare a list of high priority candidates for additional follow-up and
functional analysis.




BODY: 

WES  and  variant  prioritization 

Whole exome sequencing (WES), quality control and variant calling, variant annotation, and variant filtering are
complete on 55 samples submitted by Columbia University to the University of Miami. Additionally, WES and
analysis of 51 samples from 46 multiplex families from the University of Miami and Vanderbilt University are
complete. Identity-by-descent (IBD) analysis of Hispanic families was also performed. Following these analyses,
the candidate variants/genes shared across Hispanic families and NH-White cases were compared.
From these analyses, a list of 125 unique variants was prioritized for follow-up genotyping.

A brief overview of how each family was filtered individually and how variants for typing were prioritized follows:

1) Quality filter per individual WES sample: VQSLOD>0, PL Score>100, Read Depth>6

2) Annotation of remaining variants with ANNOVAR

3) Remove variants with MAF>0.001 in EVS_6500si and 1000G2012mar_all and MAF>0.01 in HIHG internal
controls

4) Keep variants with autosomal dominant and X-linked dominant segregation in family

5) Exclude variant if not missense, splicing, stopgain, stoploss, nonframeshift indel, or frameshift indel in
RefSeq gene annotation, Ensembl gene annotation, or UCSC Known Gene annotation

6) Filter on deleteriousness based on a) damaging score in any of these 7 programs: SIFT,
Polyphen2_HDIV, LRT, MutationTaster, MutationAssessor, or FATHMM and b) conservation based on a
conserved score in any of these 3 programs: GERP, SiPhy or PhyloP

7) Apply IBD sharing results and require 100% sharing in Hispanic families with enough GWASed individuals

8) Genotype any variant passing above filters and in a known EOAD or LOAD gene

9) Interrogate shared variants and variants in shared genes across Hispanic families and between Hispanic
and NH-White families by screening them for existence and potentially too high a MAF in dbSNP, EVS, 1000G
updates, specific 1000G populations (EA, AA, AMR and ASN, and any population in UCSC), and cg69 (69
Complete Genomics exomes). Because of the large number of candidate genes generated from filtering of the
NH-White cases, a variant from the comparison of Hispanic and NH-White candidates was only carried forward
for genotyping if the variant/gene passed this screening and was in 2+ Hispanic families and 2+ NH-White
cases. Additionally, variants/genes still in 2+ Hispanic families after the screening were carried forward for
genotyping.

10) Additional variants were selected by applying a 'secondary filter' to the Hispanic families in order to reduce
single variant per family candidates:

- remove any SNV with an rs# in dbSNP129-dbSNP137
- remove all indels
- remove families with greater than 50 variants remaining (families 1, 171, 386 and 419)
- keep only variants predicted to be damaging in 3 or more of the 7 prediction programs used

NOTE: Candidate variants for the four removed families were selected based on shared variants/genes with
other families.
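
The per-variant part of the filtering described above can be sketched in code. This is an illustrative sketch only, not the pipeline used in the project: the `Variant` record, its field names, and the function names are hypothetical, and the MAF step assumes that exceeding any one of the three frequency thresholds removes a variant (the report does not state how the databases were combined). Family-level steps (IBD sharing, cross-family comparison, dropping families with more than 50 remaining variants) are not modeled.

```python
from dataclasses import dataclass, replace

# Functional classes kept by the functional-annotation filter.
KEPT_CLASSES = {
    "missense", "splicing", "stopgain", "stoploss",
    "nonframeshift_indel", "frameshift_indel",
}


@dataclass(frozen=True)
class Variant:
    """Hypothetical per-variant record; field names are illustrative."""
    vqslod: float
    pl_score: float
    read_depth: int
    maf_evs: float             # MAF in EVS_6500si
    maf_1000g: float           # MAF in 1000G2012mar_all
    maf_internal: float        # MAF in HIHG internal controls
    segregates_dominant: bool  # fits autosomal or X-linked dominant model
    func_class: str
    n_damaging: int            # prediction programs scoring it damaging
    n_conserved: int           # conservation programs scoring it conserved
    has_rsid: bool = False     # has an rs# in dbSNP129-dbSNP137
    is_indel: bool = False


def passes_quality(v: Variant) -> bool:
    # Quality filter: VQSLOD>0, PL Score>100, Read Depth>6.
    return v.vqslod > 0 and v.pl_score > 100 and v.read_depth > 6


def is_rare(v: Variant) -> bool:
    # Assumption: a variant is removed if it exceeds ANY threshold.
    return (v.maf_evs <= 0.001 and v.maf_1000g <= 0.001
            and v.maf_internal <= 0.01)


def passes_primary(v: Variant) -> bool:
    # Quality, rarity, segregation, functional class, and at least one
    # damaging call plus at least one conservation call.
    return (passes_quality(v) and is_rare(v) and v.segregates_dominant
            and v.func_class in KEPT_CLASSES
            and v.n_damaging >= 1 and v.n_conserved >= 1)


def passes_secondary(v: Variant) -> bool:
    # Secondary filter: no rs#, no indels, damaging in 3 or more programs.
    return (passes_primary(v) and not v.has_rsid and not v.is_indel
            and v.n_damaging >= 3)


example = Variant(
    vqslod=2.5, pl_score=150, read_depth=20,
    maf_evs=0.0, maf_1000g=0.0005, maf_internal=0.0,
    segregates_dominant=True, func_class="missense",
    n_damaging=4, n_conserved=2,
)
print(passes_primary(example), passes_secondary(example))
```

Expressing each filter as a separate predicate keeps the thresholds visible and makes it straightforward to report how many variants each step removes.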

Follow-up Genotyping of Top Candidate Variants

A total of 261 Hispanic familial subjects from 19 pedigrees (145 affecteds and 116 unaffecteds) and 500 Hispanic
non-familial subjects (382 healthy controls and 118 sporadic EOAD cases) were genotyped for these 125 top
candidate variants. 101 of the variants passed all QC filters (13 variants failed genotyping and 11 were
monomorphic in the dataset). For analysis of the results of this follow-up genotyping we: 1) estimated familial and
population frequencies of the variants in our follow-up cohort and 2) tested single SNV association with AD
with 2 models using generalized estimating equations (GEE):

M1) AD~SNV+AGE+SEX
M2) AD~SNV+AGE+SEX+APOE
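
Written out, the two models might take the following form. This is a sketch under assumptions the report does not state: a logit link for the binary AD outcome, clustering of individuals j within families i, and a single APOE covariate whose coding (for example, number of e4 alleles) is not specified in the source.

```latex
\begin{align}
\text{M1:}\quad \operatorname{logit}\,\Pr(\mathrm{AD}_{ij}=1)
  &= \beta_0 + \beta_1\,\mathrm{SNV}_{ij} + \beta_2\,\mathrm{AGE}_{ij}
     + \beta_3\,\mathrm{SEX}_{ij} \\
\text{M2:}\quad \operatorname{logit}\,\Pr(\mathrm{AD}_{ij}=1)
  &= \beta_0 + \beta_1\,\mathrm{SNV}_{ij} + \beta_2\,\mathrm{AGE}_{ij}
     + \beta_3\,\mathrm{SEX}_{ij} + \beta_4\,\mathrm{APOE}_{ij}
\end{align}
```

In both models the test of interest is whether the SNV coefficient differs from zero; GEE accounts for the correlation among members of the same family through a working correlation structure, the choice of which is also not given in the report.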

Nineteen top candidate variants were identified from this follow-up genotyping. They include 8 variants that show
perfect segregation with AD status in the families and are absent in population controls. These variants are in
the genes MYO3A, AAAS, DICER1, YIPF1, ACAP1, LLGL2, BPIFB2, and ABCG2. An additional 11 variants
were identified as follow-up candidates because they show near complete segregation (absent in one or a
few familial cases) and are absent in all familial and sporadic controls. These variants are in the genes
GPR26, ERCC6, OR5M9, DNAH3, MYOCD, KIF17, TICRR, PLXNB2, LAMA2, SNRNP48, and GLB1L2.
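
The two candidate tiers described above can be illustrated with a small classifier. This is a sketch, not the procedure used in the project: the function name, its arguments, and the `max_missing` cutoff are hypothetical, and in particular the report does not quantify "a few" familial cases, so the default of 2 is an assumption.

```python
def classify_segregation(affected, unaffected, n_control_carriers,
                         max_missing=2):
    """Classify one variant by how it segregates with AD status.

    affected / unaffected: lists of booleans, True where a genotyped
    affected (or unaffected) family member carries the variant.
    n_control_carriers: carriers among unrelated population controls.
    max_missing: assumed cutoff for "absent in one or a few familial cases".
    """
    # Any carrier among unaffected relatives or population controls
    # excludes the variant from both tiers.
    if n_control_carriers > 0 or any(unaffected):
        return "not prioritized"
    # A variant with no affected carriers cannot segregate with disease.
    if not any(affected):
        return "not prioritized"
    missing = sum(1 for carries in affected if not carries)
    if missing == 0:
        return "perfect"
    if missing <= max_missing:
        return "near-complete"
    return "not prioritized"


print(classify_segregation([True, True, True], [False, False], 0))
print(classify_segregation([True, True, False], [False], 0))
```

Under these assumptions, the first call falls in the perfect-segregation tier and the second in the near-complete tier.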

These top 19 variants are now being genotyped in a cohort of African-Americans (157 familial cases, 400
sporadic cases, and 942 unrelated cases) and Non-Hispanic Caucasians (2,377 familial cases, 739 sporadic
cases, and 600 unrelated cases). All high priority variants are being genotyped in our large LOAD case-control
and family-based datasets of over 4,000 individuals.
KEY RESEARCH ACCOMPLISHMENTS:

• Variant calling and quality control processing completed on samples from 55 Hispanic individuals
submitted by Columbia and 51 NH-White samples from the University of Miami and Vanderbilt
University.

• Analysis (variant annotation and filtering) completed on samples from 55 Hispanic individuals submitted
by Columbia and 51 NH-White samples from the University of Miami and Vanderbilt University.

• Identity-by-descent analysis of Hispanic families is complete.

• Identification of 125 top candidate variants for follow-up genotyping is complete.

• Genotyping of 125 top candidate variants in the Hispanic families and a cohort of 500 Hispanic cases
and controls is complete.

• Analysis of the 125 top candidate variants in the Hispanic families and a cohort of 500 Hispanic cases
and controls is complete, with 19 top candidates identified for follow-up.

• Comparison of the top 19 Hispanic candidates from the follow-up genotyping to the Caucasian EOAD
WES samples is ongoing.

• Analysis of candidate variants/loci in our large LOAD case-control data set is ongoing.
REPORTABLE OUTCOMES:


Platform  Presentation  (Appendix  I): 

Kunkle  BW,  Kohli  MA,  Vardarajan  BN,  Reitz  C,  Naj  AC,  Whitehead  PL,  Martin  ER,  Beecham  GW,  Gilbert  JR, 
Farrer  LA,  Haines  JL,  Schellenberg  GD,  Mayeux  RP,  Pericak-Vance  MA,  Alzheimer’s  Disease  Genetics 
Consortium.  Whole-exome  sequencing  in  early-onset  Alzheimer  disease  families  identifies  rare  variants  in 
multiple  Alzheimer-related  genes  and  processes.  The  63rd  Annual  Meeting  of  the  American  Society  of  Human 
Genetics  (ASHG),  Boston,  MA,  October  22-26,  2013. 


Accepted  for  Platform  Presentation  (Appendix  II): 

Reitz C, Kunkle BW, Vardarajan BN, Kohli MA, Naj AC, Whitehead PL, Perry WR, Martin ER, Beecham GW,
Gilbert JR, Farrer LA, Haines JL, Schellenberg GD, Pericak-Vance MA, Mayeux RP, Alzheimer’s Disease
Genetics Consortium. Whole-exome sequencing of Hispanic early-onset Alzheimer disease families identifies
rare variants in multiple Alzheimer-related genes. The American Academy of Neurology (AAN) 66th Annual
Meeting, Philadelphia, PA, April 26-May 3, 2014.


CONCLUSION: 


Mutations in APP, PSEN1 and PSEN2 lead to familial EOAD and account for 60-70% of familial EOAD and
~11% of EOAD overall, leaving the majority of genetic risk for this form of Alzheimer disease unexplained. We
performed whole-exome sequencing (WES) on 55 individuals in 19 Caribbean Hispanic EOAD families and
51 Non-Hispanic White EOAD cases previously screened negative for APP, PSEN1 and PSEN2 to search for
rare variants contributing to risk for EOAD. Variants were filtered for segregating, conserved and functional
rare variants (MAF<0.1%) assuming both autosomal and X-linked dominant models. A total of 125 rare, segregating,
conserved and functional variants passed our stringent filtering criteria for selection of follow-up genotyping
candidates. These variants have undergone follow-up genotyping for segregation in the families and for
presence in a cohort of 500 Hispanic cases and controls. Nineteen top candidate variants were identified from this
follow-up genotyping. They include 8 variants that show perfect segregation with AD status in the families and
are absent in population controls. These variants are in the genes MYO3A, AAAS, DICER1, YIPF1, ACAP1,
LLGL2, BPIFB2, and ABCG2. An additional 11 variants were identified as follow-up candidates because they
show near complete segregation (absent in one or a few familial cases) and are absent in all familial and
sporadic controls. These variants are in the genes GPR26, ERCC6, OR5M9, DNAH3, MYOCD, KIF17, TICRR,
PLXNB2, LAMA2, SNRNP48, and GLB1L2. These top 19 variants are now being genotyped in a cohort of
African-Americans (157 familial cases, 400 sporadic cases, and 942 unrelated cases) and Non-Hispanic
Caucasians (2,377 familial cases, 739 sporadic cases, and 600 unrelated cases). All high priority variants are
being genotyped in our large LOAD case-control and family-based datasets of over 4,000 individuals.
APPENDICES:


Appendix  I: 

Whole-exome sequencing in early-onset Alzheimer disease families identifies rare variants in
multiple Alzheimer-related genes and processes

Brian W. Kunkle, Martin A. Kohli, Badri N. Vardarajan, Christiana Reitz, Adam C. Naj, Patrice L. Whitehead, Eden R.
Martin, Gary W. Beecham, John R. Gilbert, Lindsay A. Farrer, Jonathan L. Haines, Gerard D. Schellenberg, Richard
P. Mayeux, Margaret A. Pericak-Vance, and The Alzheimer’s Disease Genetics Consortium.

John P. Hussman Institute for Human Genomics, University of Miami, Miami, FL, USA
Taub Institute for Research on Alzheimer’s Disease, Columbia University, New York, NY, USA
School of Medicine, Boston University, Boston, MA, USA
Center for Human Genetics Research, Vanderbilt University, Nashville, TN, USA
Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA

Background 

Mutations in APP, PSEN1 and PSEN2 lead to familial, early-onset Alzheimer disease (EOAD). These mutations account
for only 60-70% of familial EOAD and ~11% of EOAD overall, leaving the majority of genetic risk for the most severe form
of Alzheimer disease unexplained.

Methods 

We performed whole-exome sequencing in Caribbean Hispanic and Caucasian EOAD families previously screened
negative for APP, PSEN1, and PSEN2 to search for rare variants contributing to risk for EOAD. 60 individuals in 21
families were sequenced using the Agilent 50Mb kit on an Illumina HiSeq2000. Variant filtering for segregating, conserved
and functional rare variants (MAF<0.1%) was performed on the 21 families assuming both autosomal-dominant and
X-linked dominant models. Filtered loci were examined for implication as AD candidate genes from GWAS or in biologically
relevant KEGG Pathways. Variants were also followed up for association with AD in 13,748 individuals (7,652 affected)
from the Alzheimer’s Disease Genetics Consortium (ADGC) genotyped on the exome chip, which included 195,039
variants with MAF<2%. Enrichment analysis of the variant list was conducted using DAVID.

Results 

984 variants in 886 genes passed our stringent filtering criteria, including 63 genes with rare segregating, conserved and
functional variants in two or more families. A frameshift mutation in ABCA7 and a missense variant in ZCWPW1 are
present in one of the 23 GWAS-confirmed Alzheimer disease candidate genes. Seven variants are in AD KEGG Pathway
genes (BID, CYC1, ITPR1, ITPR2, LRP1, ATP2A1), including two variants in LRP1, a gene involved in AD through its
roles in cholesterol transport and β-amyloid modulation. Follow-up in ADGC exome chip association results comparing
EOAD vs. late-onset AD identified 13 of our filtered genes with suggestive associations (P<10^[exponent illegible]), including ITM2C
(P=1.22x10^[exponent illegible]), a gene known to inhibit the processing of APP by blocking access to alpha- and beta-secretase.
Enrichment analysis of the list of rare conserved, functional variants showed significant, Benjamini FDR-adjusted
enrichment for several AD-related processes including the ‘ECM-receptor interaction’ and ‘ABC transporters’ KEGG
pathways; GO terms including ‘homophilic cell adhesion’ and ‘microtubule-based movement’; and multiple INTERPRO
‘cadherin’ classes.

Conclusion 

Exome sequencing of EOAD pedigrees identified multiple rare segregating variants with potential roles in AD
pathogenesis, several of which were shared in two or more families.
Appendix II:


Whole-exome sequencing of Hispanic early-onset Alzheimer disease families identifies rare variants in
multiple Alzheimer-related genes. C. Reitz, B. W. Kunkle, B. N. Vardarajan, M. A. Kohli, A. C. Naj, P. L.
Whitehead, W. R. Perry, E. R. Martin, G. W. Beecham, J. R. Gilbert, L. A. Farrer, J. L. Haines, G. D.
Schellenberg, M. A. Pericak-Vance, R. P. Mayeux, Alzheimer's Disease Genetics Consortium. 1) Taub Institute
for Research on Alzheimer's Disease, Columbia University, New York, NY, USA; 2) John P. Hussman Institute for
Human Genomics, University of Miami, Miami, FL, USA; 3) Perelman School of Medicine, University of
Pennsylvania, Philadelphia, PA, USA; 4) School of Medicine, Boston University, Boston, MA, USA; 5) Center for
Human Genetics Research, Vanderbilt University, Nashville, TN, USA.

OBJECTIVE:  To  identify  novel  early-onset  Alzheimer  disease  (EOAD)  candidate  genes. 

BACKGROUND: Mutations in APP, PSEN1 and PSEN2 lead to familial EOAD and account for 60-70% of
familial EOAD and ~11% of EOAD overall, leaving the majority of genetic risk for this form of Alzheimer disease
unexplained.

DESIGN/METHODS: We performed whole-exome sequencing (WES) on 55 individuals in 19 Caribbean Hispanic
EOAD families previously screened negative for APP, PSEN1 and PSEN2 to search for rare variants contributing
to risk for EOAD. Variants were filtered for segregating, conserved and functional rare variants (MAF<0.1%)
assuming both autosomal and X-linked dominant models. Filtered loci were examined for implication as AD
candidate genes by comparison to: late-onset Alzheimer disease (LOAD) susceptibility genes, biologically relevant
Alzheimer KEGG Pathway genes, candidate genes from 45 WESed NH-White EOAD cases, and results of an
Alzheimer's Disease Genetics Consortium (ADGC) exome chip association study.

RESULTS: 2,225 variants in 1,531 genes passed our stringent filtering criteria, including 308 genes with rare
segregating, conserved and functional variants in two or more families. Frameshift insertions-deletions in
ABCA7 and HLA-DRB1, a nonframeshift deletion in RIN3, and missense variants in DSG2 and PICALM, all LOAD
susceptibility genes, were discovered. 11 AD KEGG Pathway genes have variants, including LRP1, a gene
involved in cholesterol transport and β-amyloid modulation. 83 variant-carrying genes are in 2+ Hispanic and
2+ Non-Hispanic White families, including the AD-relevant HLA-A (associated with earlier age-at-onset),
CHST15 (a potential modulator of Abeta toxicity), and NOTCH4 (a presenilin pathway gene). Exome chip results
identified variants in MICA encoding the HLA-A gene and previously associated with LOAD in a small study, as
having suggestive association (p=9.10x10^[exponent illegible]). One family has variants in both HLA-A and MICA.

CONCLUSIONS:  Exome  sequencing  of  Hispanic  EOAD  pedigrees  identified  multiple  rare  segregating  variants 
with  potential  roles  in  AD  pathogenesis,  several  of  which  were  shared  in  two  or  more  families. 